Multilingual Acquisition of Structured Information via Novel Relationship Extraction Models over Diverse Knowledge Sources

Author

Nikesh Lucky Garera

Abstract

This dissertation presents original techniques for a class of problems that can be collectively referred to as relationship extraction. This machine learning task involves extracting tuples from free text, the exemplar instantiations of which help model the target relationship. A wide range of relationships are explored, including semantic relationships between words, their translation equivalents in different languages and encyclopedic facts about named entities. This dissertation explores new relationship extraction models which exploit novel knowledge sources across a diverse set of relationship types in multiple languages. It ties together extraction of diverse relationships in the classic seed-based minimally supervised framework. However, this framework has previously failed to capture information beyond local context such as transitively-derived information, domain constraints and knowledge, correlations among relationships and additional novel knowledge sources. Furthermore, the traditional seed-based learning framework fails to extract non-overt relationships such as an author’s gender or age when they are not explicitly stated. In contrast, some of these non-overt relationships can be inferred

Similar resources

Transfer Learning Based Cross-lingual Knowledge Extraction for Wikipedia

Wikipedia infoboxes are a valuable source of structured knowledge for global knowledge sharing. However, infobox information is very incomplete and imbalanced among the Wikipedias in different languages. It is a promising but challenging problem to utilize the rich structured knowledge from a source language Wikipedia to help complete the missing infoboxes for a target language. In this paper, ...

Graph-Based Weakly-Supervised Methods for Information Extraction & Integration

The variety and complexity of potentially-related data resources available for querying (webpages, databases, data warehouses) has been growing ever more rapidly. There is a growing need to pose integrative queries across multiple such sources, exploiting foreign keys and other means of interlinking data to merge information from diverse sources. This has traditionally been the focus of resea...

Information Extraction from Biomedical Texts: Learning Models with Limited Supervision

Among the application domains of information extraction, the biomedical domain is one of the most important ones. This is due to the large amount of biomedical text sources including the vast scientific literature and collections of patient reports written in natural language. These sources contain a wealth of crucial knowledge that needs to be mined. Typical mining tasks regard entity recognit...

A Composite Kernel to Extract Relations between Entities with Both Flat and Structured Features

This paper proposes a novel composite kernel for relation extraction. The composite kernel consists of two individual kernels: an entity kernel that allows for entity-related features and a convolution parse tree kernel that models syntactic information of relation examples. The motivation of our method is to fully utilize the nice properties of kernel methods to explore diverse knowledge for r...

Modern Multilingual and Cross-lingual Information Access Technologies

In this chapter, we describe state-of-the-art cross-lingual and multilingual strategies and their related areas. In particular, we show a WWW-based information system called MIETTA, which allows uniform and multilingual access to heterogeneous data sources in the tourism domain. The design of the search engine is based on a new cross-lingual framework. The framework integrates a cross-lingu...

Publication date: 2009
